---
title: "Step Retries | Error Handling | Kastrax Docs"
description: "Automatically retry failed steps in Kastrax workflows with configurable retry policies."
---

# Step Retries ✅

Kastrax provides built-in retry mechanisms to handle transient failures in workflow steps. This allows workflows to recover gracefully from temporary issues without requiring manual intervention.

## Overview ✅

When a step in a workflow fails (throws an exception), Kastrax can automatically retry the step execution based on a configurable retry policy. This is useful for handling:

- Network connectivity issues
- Service unavailability
- Rate limiting
- Temporary resource constraints
- Other transient failures

## Default Behavior ✅

By default, steps do not retry when they fail. This means:

- A step will execute once
- If it fails, it will immediately mark the step as failed
- The workflow will continue to execute any subsequent steps that don't depend on the failed step

## Configuration Options ✅

Retries can be configured at two levels:

### 1. Workflow-level Configuration

You can set a default retry configuration for all steps in a workflow:

```typescript
const workflow = new Workflow({
  name: 'my-workflow',
  retryConfig: {
    attempts: 3,    // Number of retries (in addition to the initial attempt)
    delay: 1000,    // Delay between retries in milliseconds
  },
});
```

### 2. Step-level Configuration

You can also configure retries on individual steps, which will override the workflow-level configuration for that specific step:

```typescript
const fetchDataStep = new Step({
  id: 'fetchData',
  execute: async () => {
    // Fetch data from external API
  },
  retryConfig: {
    attempts: 5,    // This step will retry up to 5 times
    delay: 2000,    // With a 2-second delay between retries
  },
});
```

## Retry Parameters ✅

The `retryConfig` object supports the following parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `attempts` | number | 0 | The number of retry attempts (in addition to the initial attempt) |
| `delay` | number | 1000 | Time in milliseconds to wait between retries |

## How Retries Work ✅

When a step fails, Kastrax's retry mechanism:

1. Checks if the step has retry attempts remaining
2. If attempts remain:
   - Decrements the attempt counter
   - Transitions the step to a "waiting" state
   - Waits for the configured delay period
   - Retries the step execution
3. If no attempts remain or all attempts have been exhausted:
   - Marks the step as "failed"
   - Continues workflow execution (for steps that don't depend on the failed step)

During retry attempts, the workflow execution remains active but paused for the specific step that is being retried.

## Examples ✅

### Basic Retry Example

```typescript
import { Workflow, Step } from '@kastrax/core/workflows';

// Define a step that might fail
const unreliableApiStep = new Step({
  id: 'callUnreliableApi',
  execute: async () => {
    // Simulate an API call that might fail
    const random = Math.random();
    if (random < 0.7) {
      throw new Error('API call failed');
    }
    return { data: 'API response data' };
  },
  retryConfig: {
    attempts: 3,  // Retry up to 3 times
    delay: 2000,  // Wait 2 seconds between attempts
  },
});

// Create a workflow with the unreliable step
const workflow = new Workflow({
  name: 'retry-demo-workflow',
});

workflow
  .step(unreliableApiStep)
  .then(processResultStep)
  .commit();
```

### Workflow-level Retries with Step Override

```typescript
import { Workflow, Step } from '@kastrax/core/workflows';

// Create a workflow with default retry configuration
const workflow = new Workflow({
  name: 'multi-retry-workflow',
  retryConfig: {
    attempts: 2,  // All steps will retry twice by default
    delay: 1000,  // With a 1-second delay
  },
});

// This step uses the workflow's default retry configuration
const standardStep = new Step({
  id: 'standardStep',
  execute: async () => {
    // Some operation that might fail
  },
});

// This step overrides the workflow's retry configuration
const criticalStep = new Step({
  id: 'criticalStep',
  execute: async () => {
    // Critical operation that needs more retry attempts
  },
  retryConfig: {
    attempts: 5,  // Override with 5 retry attempts
    delay: 5000,  // And a longer 5-second delay
  },
});

// This step disables retries
const noRetryStep = new Step({
  id: 'noRetryStep',
  execute: async () => {
    // Operation that should not retry
  },
  retryConfig: {
    attempts: 0,  // Explicitly disable retries
  },
});

workflow
  .step(standardStep)
  .then(criticalStep)
  .then(noRetryStep)
  .commit();
```

## Monitoring Retries ✅

You can monitor retry attempts in your logs. Kastrax logs retry-related events at the `debug` level:

```
[DEBUG] Step fetchData failed (runId: abc-123)
[DEBUG] Attempt count for step fetchData: 2 remaining attempts (runId: abc-123)
[DEBUG] Step fetchData waiting (runId: abc-123)
[DEBUG] Step fetchData finished waiting (runId: abc-123)
[DEBUG] Step fetchData pending (runId: abc-123)
```

## Best Practices ✅

1. **Use Retries for Transient Failures**: Only configure retries for operations that might experience transient failures. For deterministic errors (like validation failures), retries won't help.

2. **Set Appropriate Delays**: Consider using longer delays for external API calls to allow time for services to recover.

3. **Limit Retry Attempts**: Don't set extremely high retry counts as this could cause workflows to run for excessive periods during outages.

4. **Implement Idempotent Operations**: Ensure your step's `execute` function is idempotent (can be called multiple times without side effects) since it may be retried.

5. **Consider Backoff Strategies**: For more advanced scenarios, consider implementing exponential backoff in your step's logic for operations that might be rate-limited.

## Related ✅

- [Step Class Reference](./step-class.mdx)
- [Workflow Configuration](./workflow.mdx)
- [Error Handling in Workflows](../../docs/workflows/error-handling.mdx)
